diffusion approximation
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Diffusion Approximations for Online Principal Component Estimation and Global Convergence
In this paper, we propose to adopt the diffusion approximation tools to study the dynamics of Oja's iteration which is an online stochastic gradient method for the principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys less temporal and spatial complexities. We show that the Oja's iteration for the top eigenvector generates a continuous-state discrete-time Markov chain over the unit sphere. We characterize the Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for PCA under the additional assumption of bounded samples.
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
High-Dimensional Privacy-Utility Dynamics of Noisy Stochastic Gradient Descent on Least Squares
Lin, Shurong, Kolaczyk, Eric D., Smith, Adam, Paquette, Elliot
The interplay between optimization and privacy has become a central theme in privacy-preserving machine learning. Noisy stochastic gradient descent (SGD) has emerged as a cornerstone algorithm, particularly in large-scale settings. These variants of gradient methods inject carefully calibrated noise into each update to achieve differential privacy, the gold standard notion of rigorous privacy guarantees. Prior work primarily provides various bounds on statistical risk and privacy loss for noisy SGD, yet the \textit{exact} behavior of the process remains unclear, particularly in high-dimensional settings. This work leverages a diffusion approach to analyze noisy SGD precisely, providing a continuous-time perspective that captures both statistical risk evolution and privacy loss dynamics in high dimensions. Moreover, we study a variant of noisy SGD that does not require explicit knowledge of gradient sensitivity, unlike existing work that assumes or enforces sensitivity through gradient clipping. Specifically, we focus on the least squares problem with $\ell_2$ regularization.
- North America > United States > Pennsylvania (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Achieving O(1/N) Optimality Gap in Restless Bandits through Diffusion Approximation
Yan, Chen, Wang, Weina, Ying, Lei
The Restless Multi-Armed Bandit (RMAB) problem is a fundamental framework in decision theory and operations research, where a decision maker must choose which among multiple tasks (arms) to work on (pull) at each time step in order to maximize cumulative reward [24]. Unlike the classic bandit problem [14], in the restless variant, the state of each arm evolves stochastically regardless of whether it is pulled. This problem has gained significant attention due to its applicability in various domains where optimal decision-making under uncertainty is critical, such as machine maintenance [11], target tracking [17], network communication [18] and clinic trials [22], to name a few. Despite its relevance, the RMAB problem is known to be PSPACE-hard [19], and finding optimal policies is computationally challenging, especially when the number of arms N is large. In this paper, we focus on the finite horizon version of the RMAB problem with N homogeneous arms and horizon H, where each arm follows the same (time-dependent) state transition and reward function. While computing the exact optimal policy is impractical, the homogeneity of the model allows for the design of efficient heuristic policies. One such class of heuristics is based on fluid approximation, which transforms the original N-armed RMAB problem into a Linear Program (LP).
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- (2 more...)
Reviews: Diffusion Approximations for Online Principal Component Estimation and Global Convergence
They show it can be approximated using a stochastic process and provide global convergence guarantees which match the optimal . Although the related work section is very dense, it is well organized and serves as a good starting point for the non-expert reader. As a non expert in this area I find it quite hard to judge the merits of this paper in detail. Having said that it seems that providing the first global convergence guarantees for Oja's rule ought to be of interest. The analysis seems to thoroughly explain the different phases in the optimization procedure which perhaps could be useful as a diagnostic tool.
Approximating G(t)/GI/1 queues with deep learning
Sherzer, Eliran, Baron, Opher, Krass, Dmitry, Resheff, Yehezkel
In this paper, we apply a supervised machine-learning approach to solve a fundamental problem in queueing theory: estimating the transient distribution of the number in the system for a G(t)/GI/1. We develop a neural network mechanism that provides a fast and accurate predictor of these distributions for moderate horizon lengths and practical settings. It is based on using a Recurrent Neural Network (RNN) architecture based on the first several moments of the time-dependant inter-arrival and the stationary service time distributions; we call it the Moment-Based Recurrent Neural Network (RNN) method (MBRNN ). Our empirical study suggests MBRNN requires only the first four inter-arrival and service time moments. We use simulation to generate a substantial training dataset and present a thorough performance evaluation to examine the accuracy of our method using two different test sets. We show that even under the configuration with the worst performance errors, the mean number of customers over the entire timeline has an error of less than 3%. While simulation modeling can achieve high accuracy, the advantage of the MBRNN over simulation is runtime, while the MBRNN analyzes hundreds of systems within a fraction of a second. This paper focuses on a G(t)/GI/1; however, the MBRNN approach demonstrated here can be extended to other queueing systems, as the training data labeling is based on simulations (which can be applied to more complex systems) and the training is based on deep learning, which can capture very complex time sequence tasks. In summary, the MBRNN can potentially revolutionize our ability to perform transient analyses of queueing systems.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Taiwan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.48)
- Health & Medicine (0.67)
- Transportation (0.46)
Diffusion Generative Modelling for Divide-and-Conquer MCMC
Trojan, C., Fearnhead, P., Nemeth, C.
Divide-and-conquer MCMC is a strategy for parallelising Markov Chain Monte Carlo sampling by running independent samplers on disjoint subsets of a dataset and merging their output. An ongoing challenge in the literature is to efficiently perform this merging without imposing distributional assumptions on the posteriors. We propose using diffusion generative modelling to fit density approximations to the subposterior distributions. This approach outperforms existing methods on challenging merging problems, while its computational cost scales more efficiently to high dimensional problems than existing density estimation approaches.
- Europe > United Kingdom (0.04)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
Stochastic Modified Flows for Riemannian Stochastic Gradient Descent
Gess, Benjamin, Kassing, Sebastian, Rana, Nimit
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry we show that, in the small learning rate regime, RSGD can be approximated by the solution to the RSMF driven by an infinite-dimensional Wiener process. The RSMF accounts for the random fluctuations of RSGD and, thereby, increases the order of approximation compared to the deterministic Riemannian gradient flow. The RSGD is build using the concept of a retraction map, that is, a cost efficient approximation of the exponential map, and we prove quantitative bounds for the weak error of the diffusion approximation under assumptions on the retraction map, the geometry of the manifold, and the random estimators of the gradient.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- (5 more...)